The emotional effect of a movie depends largely on its music, in combination with the visual information, so emotions are key elements in movies. When we think of a particular movie we have seen, it does not take much to remember the kind of songs played in it. Different melodies, keys, instruments and many other aspects can produce very different responses in our brain.
In romantic dramas, emotions like ‘love’ and a ‘sense of longing’ are the main characteristics, but ‘sadness’ sometimes also plays a role. In horror movies, fear and anxiety are the main emotions expressed in the music, and dark overtones will be present. In action movies, high-intensity emotions like excitement are essential. In feelgood-comedy movies, ‘happiness’ and ‘joy’ are the main emotions, so these tracks will contain many elements of ‘happy’ music, such as major keys.
The corpus for this portfolio covers a representative selection of typical movies within four movie genres. This selection is based on the genre categorization of the Internet Movie Database (IMDb). Movies within an IMDb genre that have typical features of other genres were excluded, as were ‘dialogue’ tracks from the Spotify albums. The Romance/Drama playlist contains 218 tracks (12 movies), the Feelgood/Comedy playlist contains 212 tracks (14 movies), the Horror playlist has 234 tracks (10 movies) and the Action playlist has a total of 222 tracks (11 movies). Only Spotify albums labeled ‘Original Motion Picture Soundtrack’ were selected.
In this portfolio, analyses were performed at multiple levels. First, individual tracks from the corpus are analyzed with, for example, chromagrams, chordograms and cepstrograms. Lastly, a prediction analysis is performed over the whole corpus.
What are the expectations?
A horror movie is defined here as a movie that seeks to scare or unsettle the audience. It is expected that this music is mostly written in the minor key. Minor chords are typically associated with sadness and melancholy. Music in horror movies often wobbles and sounds deliberately out of tune, for example through glissandi on violins (the screeching upward slides). Pitch is destabilized, and pitch drops are used to stress the ‘unexpected’. One of the most iconic sounds is the sudden sforzando tutti crash, designed to shock the audience instantly. It often happens in the midst of a musical silence or after a pedal note.
Romantic dramas generally contain both ‘loving feelings’ and ‘sadness’, which is why tracks in this genre will probably vary between the major and the minor key.
The action movie offers thrills (e.g. shooting) and spectacle (e.g. explosions).
Comedy movies contain very happy tracks overall because of the feelgood vibes. Happy tunes are written in the major key, are louder than tracks in other genres and have a high valence.
What is the corpus of this portfolio?
In this small study these questions will be answered:
Mode 0 = minor, mode 1 = major
This graphic shows the emotional quadrant of tracks from the movie genres Horror, Action, Feelgood/Comedy and Romance/Drama, with color representing the mode and size representing the loudness of the track. Valence describes musical positiveness; energy describes arousal. Overall, louder songs fall in the Happy/Joyful section. Horror and Action are clearly distinct from the other two genres: the majority of their tracks have very low valence values. Most Horror tracks are concentrated in the Depressing/Sad section of the quadrant, with many minor-key songs that are very ‘quiet’ overall. Action tracks are more spread out, but fall mainly in the Angry/Turbulent and Depressing/Sad sections. Surprisingly, the majority of these tracks do not seem to be very loud at all, which contrasts with the expectations. Feelgood/Comedy tracks are more scattered, but compared with the other genres this genre has many tracks in the Happy/Joyful section, including more loud songs, which is in line with the expectations. Romance/Drama tracks are located mainly in the upper right and bottom left of the plot, but have a low valence overall.
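To make the quadrant idea concrete, here is a minimal sketch of how a track could be assigned to a quadrant from its valence and energy. It is an illustration only: the function name `emotional_quadrant`, the cut-off of 0.5 on both axes and the label for the high-valence, low-energy quadrant (which the text does not name) are assumptions, not part of the original analysis.

```python
def emotional_quadrant(valence: float, energy: float, threshold: float = 0.5) -> str:
    """Map a (valence, energy) pair to one of four quadrant labels.

    The 0.5 threshold is an assumed midpoint of the 0-1 scale.
    """
    if not (0.0 <= valence <= 1.0 and 0.0 <= energy <= 1.0):
        raise ValueError("valence and energy must lie in [0, 1]")

    high_valence = valence >= threshold
    high_energy = energy >= threshold

    if high_valence and high_energy:
        return "Happy/Joyful"
    if high_energy:
        return "Angry/Turbulent"
    if high_valence:
        return "Calm/Relaxed"  # assumed label; this quadrant is not named in the text
    return "Depressing/Sad"


# Illustrative values: a very low-valence, low-energy track
print(emotional_quadrant(0.02, 0.07))
# Illustrative values: a positive, energetic track
print(emotional_quadrant(0.85, 0.80))
```

A fixed threshold is the simplest possible choice; in practice the boundaries between the sections are gradual rather than sharp.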
These density plots show the distribution of tempo in beats per minute (BPM) for the movie genres. On average, Feelgood/Comedy movies have the highest BPM. The average BPM for Horror movies is the lowest (89 BPM). The distributions for Romance/Drama and Action are about the same. It is also clear that, overall, songs in a minor key have a slightly lower BPM than songs in a major key, especially in Action movies.
The expectation was that Action movies would have the highest overall tempo of all genres. An explanation for this result is that the average Action movie has both very uptempo tracks and very quiet and slow tracks. Slow ‘Horror-like’ tracks are used to build up tension. That is why the distribution for major Action tracks is quite wide. The tempo distribution of minor songs in Action movies is also a surprising result. Apparently, the slower songs used in Action movies are mostly written in a minor key. As you may remember, songs written in a minor key are perceived as more depressing. So slow tracks are used to convey a ‘depressing’ or ‘negative’ feeling, and more uptempo tracks are used to convey more positive, maybe ‘hero-like’ feelings.
| category | mode | median tempo (BPM) |
|---|---|---|
| Action | Major | 117.594 |
| Action | Minor | 98.890 |
| Feelgood/Comedy | Major | 125.056 |
| Feelgood/Comedy | Minor | 116.313 |
| Horror | Major | 88.599 |
| Horror | Minor | 90.337 |
| Romance/Drama | Major | 108.105 |
| Romance/Drama | Minor | 101.931 |
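A table like the one above can be produced by grouping tracks by genre and mode and taking the median tempo per group. The sketch below shows this with the Python standard library. The function name `median_tempo_by_group`, the dictionary keys and the toy tempo values are hypothetical and are not taken from the corpus.

```python
from collections import defaultdict
from statistics import median


def median_tempo_by_group(tracks):
    """Return {(category, mode): median tempo} for a list of track dicts."""
    groups = defaultdict(list)
    for track in tracks:
        groups[(track["category"], track["mode"])].append(track["tempo"])
    return {key: median(tempos) for key, tempos in sorted(groups.items())}


# Toy data for illustration only (not the real corpus values)
toy_tracks = [
    {"category": "Action", "mode": "Major", "tempo": 120.0},
    {"category": "Action", "mode": "Major", "tempo": 100.0},
    {"category": "Action", "mode": "Major", "tempo": 140.0},
    {"category": "Horror", "mode": "Minor", "tempo": 80.0},
    {"category": "Horror", "mode": "Minor", "tempo": 100.0},
]

for (category, mode), value in median_tempo_by_group(toy_tracks).items():
    print(f"{category:10s} {mode:6s} {value:.3f}")
```

The median is less sensitive than the mean to a few extreme tempo estimates, which may be one reason to prefer it for this kind of summary.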
In a self-similarity matrix (SSM), each element of the feature sequence is compared with all other elements. Path-like structures represent exact repetitions. There is always one main diagonal visible, because both axes represent the exact same song. Block-like structures represent homogeneous regions, where the musical features stay roughly constant over the duration of an entire musical part.
The visualizations on the left are self-similarity matrices for ‘chroma’. They show at which points in the track the same pitches occur. The visualizations on the right represent the same song, but for ‘timbre’, also referred to as ‘tone color’. Later on there will be more information about this musical feature.
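The comparison of every frame with every other frame can be sketched in a few lines. The example below is an assumption-laden illustration: it uses cosine distance (so 0 means identical and larger values mean less similar), and the names `cosine_distance` and `self_similarity_matrix` as well as the toy frames are hypothetical. The portfolio itself may have used a different distance measure.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two feature vectors (0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat silent (all-zero) frames as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def self_similarity_matrix(frames):
    """Compare every frame with every other frame."""
    n = len(frames)
    return [[cosine_distance(frames[i], frames[j]) for j in range(n)]
            for i in range(n)]


# Toy feature sequence with an A-B-A-B pattern (illustrative, not real chroma)
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
ssm = self_similarity_matrix(frames)

for row in ssm:
    print(" ".join(f"{value:.1f}" for value in row))
```

In this toy sequence the repetition of the A-B pattern produces low values on a line parallel to the main diagonal, which is the path-like structure described above.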
is a typical Feelgood/Comedy track (what else!) from ‘They Came Together’. The SSM of chroma shows a block-like pattern. At t = 25 some percussion instruments and a piano come to the fore. In the SSM of timbre the last section is very distinguishable. It is clear that a different instrumentation is used: the lead singer stops and an electric guitar starts playing.
is a track from the Romance/Drama movie ‘Pride and Prejudice’. This is a good example of exact repetitions. Only diagonal lines are visible, with no big block-like structures. Lines that run parallel to the main diagonal are exact repetitions. This is very clear when you listen to the track.
Judge for yourself!
Both plots on the left are chromagrams. These sum all pitch coefficients that belong to the same chroma, so this representation is cyclic in nature. The plots on the right are chordograms showing the chords used in the track.
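The summing of pitches into chroma classes can be illustrated as follows. This is a minimal sketch assuming pitches are given as MIDI note numbers with an energy value each; the function name `chroma_vector` and the toy input are hypothetical. It also shows why octave information is lost: C4 and C5 end up in the same bin.

```python
CHROMA_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chroma_vector(pitch_energies):
    """Fold {MIDI note number: energy} into 12 chroma bins (C, C#, ..., B)."""
    chroma = [0.0] * 12
    for midi_pitch, energy in pitch_energies.items():
        # Notes an octave apart differ by 12 semitones, so they share a bin
        chroma[midi_pitch % 12] += energy
    return chroma


# Toy frame: C4 (60), C5 (72) and A4 (69) sounding together
frame = {60: 1.0, 72: 0.5, 69: 2.0}
for name, value in zip(CHROMA_NAMES, chroma_vector(frame)):
    if value > 0:
        print(name, value)
```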
is a Horror track from the movie ‘The Lighthouse’. This track is written in the minor key and has an extremely low valence (0.021). It has both sections with long-lasting pitches and sections with many pitches played at the same time (pitch mix). This is why the track does not sound harmonically ‘correct’ to the human ear, as is also the case for the use of very high- and low-pitched sounds. However, in these plots high and low pitch coefficients are not distinguished because of the cyclic nature of chroma. These characteristics make it a very typical Horror track. The sections are accentuated with white vertical lines.
A typical feature of the Action genre is a build-up that creates tension. This build-up is visible in the chromagram from t = 0 to t = 53. This change is also associated with a change in key (see chordogram). There is a fast alternation of different pitches throughout the whole track, which is also typical of an Action movie.
Timbre, also known as “tone color”, is the perceived sound quality of a musical note or sound. It distinguishes various types of musical instruments. There are twelve timbre coefficients in total. The values are high level abstractions of the spectral surface ordered by degree of importance.
Doll Box is different from the majority of horror film music. It has an extremely high valence (0.854) for Horror film music. This track is compared with Curse Your Name, which is a very typical horror track with an extremely low valence (0.021). The energy values are more or less the same (0.0625 and 0.0725). Doll Box is written in a major key, with very high-pitched, distinctive tones.
In Doll Box it is very clear which timbre components are used: c02, c03 and c04. The sound of this track represents the typical sound of a doll music box. Because these three components are very constant throughout the whole track, it is hard to distinguish the different sound characteristics. Two clear sections in the cepstrogram are the very contrasting parts at t = 1 and at t = 32. These parts are riffs on a copper xylophone.
As you can see, the timbral components of Curse Your Name are much more spread out in the cepstrogram. This is very typical of horror music, which tends to have many different musical characteristics and instruments played at the same time in order to give the audience an uncomfortable and unsettling feeling. The first bright yellow part of c02 is very distinguishable in this track. It represents a wind instrument, probably a trumpet. Immediately after this part, a somewhat longer c01 section appears. This part is a very low-pitched string instrument; it sounds like a string bass. The yellow part of c05 is a very sharp sound, a high-pitched flute which is very unpleasant to the ear. It is clear that very high- and low-pitched sounds are used at the same time in typical horror music.
Comparison of c03 in both tracks: in both tracks this is a very high-pitched instrument. In Doll Box, however, the sound is made by a percussion instrument, while in Curse Your Name it sounds more like a wind instrument.
Analysis performed over the first 60 tracks from each genre
When comparing the twelve Spotify timbre coefficients between the four movie genres, the main difference lies in c03, c06 and c11. c03 is related to the ‘flatness’ of the sound. A low flatness indicates a “spiky” spectrum (a mixture of sine waves) and a high flatness indicates that the spectrum has a similar amount of power in all spectral bands (i.e. similar to white noise). It looks like Action movies have a lower flatness overall, meaning that this genre has a bigger mixture of sounds. Feelgood/Comedy and Horror have more positive values, meaning that these genres have less diversity in spectral bands.
Because timbre features are very hard to interpret, it is not possible to explain coefficients 6 and 11.
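The notion of flatness can be made concrete with the textbook definition of spectral flatness: the geometric mean of the power spectrum divided by its arithmetic mean. The sketch below is only an illustration of that general definition. It is not how Spotify computes its timbre coefficients, and the function name `spectral_flatness` and the toy spectra are hypothetical.

```python
import math


def spectral_flatness(power_spectrum):
    """Geometric mean divided by arithmetic mean of a power spectrum.

    Values close to 1 indicate a flat, noise-like spectrum;
    values close to 0 indicate a spiky spectrum.
    """
    if not power_spectrum or any(p <= 0 for p in power_spectrum):
        raise ValueError("power values must be positive")
    n = len(power_spectrum)
    # Compute the geometric mean via logarithms for numerical stability
    geometric_mean = math.exp(sum(math.log(p) for p in power_spectrum) / n)
    arithmetic_mean = sum(power_spectrum) / n
    return geometric_mean / arithmetic_mean


flat = [1.0, 1.0, 1.0, 1.0]        # equal power in all bands (noise-like)
spiky = [100.0, 0.01, 0.01, 0.01]  # nearly all power in one band

print(spectral_flatness(flat))
print(spectral_flatness(spiky))
```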
Genre classification with all four genres was performed using support vector machines in a ten-fold cross-validation test. Confusion matrices for ‘all features’ (track-level, timbral and chroma features), ‘timbral features’ and ‘chroma features’ are compared. The darker the grey, the better the prediction. Overall, the chroma features performed the worst among the classification tasks, while all features combined predicted movie genre the best.
All features combined gave the highest genre accuracies for Romance/Drama and Horror. The confusion matrix shows a clear dark-grey diagonal from the upper left to the bottom right, which means that most tracks were assigned to their true genre.
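To illustrate what ten-fold cross-validation means, the sketch below only shows how a corpus can be split into ten folds, where each fold serves once as the test set while the other nine are used for training. The support vector machine itself is left out. The function name `k_fold_indices`, the shuffling seed and the unstratified splitting are assumptions; the original analysis may have split the data differently.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Total number of tracks in the four playlists described in the corpus section
n_tracks = 218 + 212 + 234 + 222

for fold_number, (train, test) in enumerate(k_fold_indices(n_tracks), start=1):
    print(f"fold {fold_number}: {len(train)} training tracks, {len(test)} test tracks")
```

Because every track ends up in exactly one test fold, each track is predicted exactly once by a model that never saw it during training.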
Horror and Feelgood/Comedy movies are the most distant when looking at the confusion matrix for all features and timbral features. This makes sense, because Comedies are in general very happy and uptempo, while Horror movie tracks are overall very slow and depressing.
Horror is most often confused with Action. This means that the timbral features of Horror movies are somewhat similar to those of Action movies.
Because the definition of the timbral coefficients is somewhat arbitrary, we cannot really conclude what these features exactly are. But when looking at the results for the timbre coefficients in the previous chapter (page 7), timbre coefficients c03, c06 and c11 differ across the genres.
Chroma features (pitches) do not seem to predict movie genre very well overall. Music from Action movies is the least distinguishable when looking at chroma features.
Many of the true Horror tracks are categorized as Action or Feelgood/Comedy. This suggests that the chroma features of Horror tracks resemble those of Action and Feelgood/Comedy tracks. It looks like the pitches used in Horror and Romance/Drama tracks are the most distant from each other.
Romance/Drama tracks are the most distinct on chroma features, so these songs appear to share some specific chroma characteristics.
| class | precision | recall |
|---|---|---|
| Action | 0.4705882 | 0.4684685 |
| Feelgood/Comedy | 0.6683938 | 0.6084906 |
| Horror | 0.6299559 | 0.6111111 |
| Romance/Drama | 0.5795918 | 0.6513761 |
| class | precision | recall |
|---|---|---|
| Action | 0.4826087 | 0.5000000 |
| Feelgood/Comedy | 0.5602094 | 0.5047170 |
| Horror | 0.6052632 | 0.5897436 |
| Romance/Drama | 0.5147679 | 0.5596330 |
| class | precision | recall |
|---|---|---|
| Action | 0.3519313 | 0.3693694 |
| Feelgood/Comedy | 0.3603239 | 0.4198113 |
| Horror | 0.4221106 | 0.3589744 |
| Romance/Drama | 0.5603865 | 0.5321101 |
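The precision and recall values in tables like these are derived from a confusion matrix. Precision for a class is the share of tracks predicted as that class that truly belong to it; recall is the share of tracks truly in that class that were predicted correctly. The sketch below shows the calculation on a toy two-class matrix. The function name `precision_recall`, the row/column convention and the counts are hypothetical and are not the counts behind the tables.

```python
def precision_recall(confusion, labels):
    """Per-class precision and recall.

    confusion[i][j] = number of tracks with true class i predicted as class j
    (an assumed convention: rows are true classes, columns are predictions).
    """
    scores = {}
    for k, label in enumerate(labels):
        true_positives = confusion[k][k]
        predicted_as_k = sum(row[k] for row in confusion)  # column sum
        actually_k = sum(confusion[k])                     # row sum
        precision = true_positives / predicted_as_k if predicted_as_k else 0.0
        recall = true_positives / actually_k if actually_k else 0.0
        scores[label] = (precision, recall)
    return scores


# Toy counts for illustration only
labels = ["Horror", "Action"]
confusion = [
    [8, 2],  # true Horror: 8 predicted Horror, 2 predicted Action
    [4, 6],  # true Action: 4 predicted Horror, 6 predicted Action
]

for label, (precision, recall) in precision_recall(confusion, labels).items():
    print(f"{label}: precision={precision:.3f}, recall={recall:.3f}")
```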
The decision tree is a prediction model that is used in machine learning. The goal is to create a model that predicts the value of a target variable based on several input variables.
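The core step of a decision tree is choosing a threshold on one input variable that separates the classes as cleanly as possible. The sketch below shows that single step using Gini impurity, which is one common splitting criterion. Whether the tree in this portfolio used Gini impurity is an assumption, and the function names `gini` and `best_split` and the toy valence values are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0 for a pure group, higher for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Find the threshold on one feature that minimizes weighted Gini impurity.

    Returns (threshold, impurity); threshold is None if no split helps.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


# Toy valence values for illustration only
valence = [0.05, 0.10, 0.20, 0.60, 0.70, 0.80]
genre = ["Horror", "Horror", "Horror",
         "Feelgood/Comedy", "Feelgood/Comedy", "Feelgood/Comedy"]

threshold, impurity = best_split(valence, genre)
print(f"split at valence <= {threshold:.2f}, weighted impurity {impurity:.2f}")
```

A full tree repeats this search over all input variables and then recursively within each resulting group.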
| class | precision | recall |
|---|---|---|
| Action | 0.6526946 | 0.4909910 |
| Feelgood/Comedy | 0.6791667 | 0.7688679 |
| Horror | 0.6806084 | 0.7649573 |
| Romance/Drama | 0.7037037 | 0.6972477 |
With the decision tree classification, precision and recall for the different genres are much better than those derived from the confusion matrices. Recall is highest for Feelgood/Comedy and Horror, at roughly 77% and 76% respectively, so about three quarters of the tracks that belong to these genres are correctly categorized. The prediction is weakest for Action movies.
From the forest model, it is very clear that there are some features that really characterize the four movie genres. The top 6 features are:
It seems that timbre c06 is an important component in distinguishing movie genres. However, because timbre components above c04 are very hard to identify, it is not very clear what aspect this exactly is. It looks like valence is the best predictor of all Spotify features.
On the next page, a new analysis on these 5 components is being performed.
In the table the top 5 features from the forest model are being compared with all features. It looks like this increased the precision and recall for all genres, especially for the Romance/Drama. However, the Action genre doesn’t improve that much and is still being confused with the Horror genre.
| class | precision | recall |
|---|---|---|
| Action | 0.4476190 | 0.4234234 |
| Feelgood/Comedy | 0.5673077 | 0.5566038 |
| Horror | 0.5226337 | 0.5427350 |
| Romance/Drama | 0.4666667 | 0.4816514 |
| class | precision | recall |
|---|---|---|
| Action | 0.4705882 | 0.4684685 |
| Feelgood/Comedy | 0.6683938 | 0.6084906 |
| Horror | 0.6299559 | 0.6111111 |
| Romance/Drama | 0.5795918 | 0.6513761 |
This small study presents a preliminary examination of a corpus of music collected from film scores in four genres (Action, Romance/Drama, Horror and Feelgood/Comedy), from a total of 47 movies, using many kinds of music representations: from track-level features, chroma and timbre self-similarity matrices, to musical keys and tempo, to classification estimates using confusion matrices and a forest model.
From the analysis of musical features from film music we can conclude the following:
The results support the notion that high intensity movies (i.e. Action and Horror) have musical cues that are measurably different from the musical scores for movies with more measured expressions of emotion (i.e. Romance/Drama and Feelgood/Comedy).
Even when using very distinct movie genres, it is clear that such a labeling scheme is likely too broad, as several tracks within a specific genre may exhibit characteristics of music from another genre (e.g. an action scene in a romantic drama movie, or vice versa). A closer examination of each individual track will probably help to improve classification accuracy. However, this small study definitely gave some insights into the main characteristics of typical movie genres.
The creator
This portfolio was made by me, Iris, a psychology student who follows the course ‘Computational Musicology’ as part of the Minor Artificial Intelligence. When I started this course, I didn’t quite know where it would bring me. The only thing I knew was that we were going to do something with Spotify features, but that was it. At first I didn’t even realize what the possibilities were with the Spotify API. In the first online lectures I realized that it was even bigger than I ever could have imagined! It got me excited to play around with all the possibilities and incorporate them into my own portfolio.
I had no previous coding experience, only very minimal R skills from my bachelor in Psychology and some Python skills. With this course I’ve made huge steps in coding in R and this is definitely something that will benefit my future career in psychology. It was sometimes a challenge, but I enjoyed learning this new skill and the possibilities with R and the Spotify API, and I’m very happy with the results.
I especially wanted to choose a music subject that was somehow related to the field of psychology. But how would I incorporate this in such a computational course, without participants? After some brainstorming I came up with the idea of using film music. I really enjoyed making this portfolio, and I hope you enjoyed reading it too!